Korean-Chinese Person Name Translation for Cross Language Information Retrieval

Authors

  • Yu-Chun Wang
  • Yi-Hsun Lee
  • Chu-Cheng Lin
  • Richard Tzong-Han Tsai
  • Wen-Lian Hsu
Abstract

Named entity translation plays an important role in many applications, such as information retrieval and machine translation. In this paper, we focus on translating person names, the most common type of named entity in Korean-Chinese cross-language information retrieval (KCIR). Unlike other languages, Chinese uses characters (ideographs), which makes person name translation difficult because one syllable may map to several Chinese characters. We propose an effective hybrid person name translation method to improve the performance of KCIR. First, we use Wikipedia as a translation tool based on the inter-language links between the Korean edition and the Chinese or English editions. Second, we adopt the Naver people search engine to find the query name’s Chinese or English translation. Third, we extract Korean-English transliteration pairs from Google snippets, and then search for the English-Chinese transliteration in the database of Taiwan’s Central News Agency or in Google. The performance of KCIR using our method is over five times better than that of a dictionary-based system. The mean average precision is 0.3490 and the average recall is 0.7534. The method can handle Chinese, Japanese, and Korean person names, as well as non-CJK person names, when translating from Korean to Chinese. Hence, it substantially improves the performance of KCIR.
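The three resources named in the abstract can be read as a cascade in which each later stage is consulted only when the earlier ones return nothing. The sketch below illustrates that reading only. The abstract does not say the stages are strict fallbacks, so the ordering logic is an assumption. The function names, stage labels, and toy lookup tables are hypothetical stand-ins for the real Wikipedia, Naver, and Google/Central News Agency lookups, and no network access is performed.

```python
from typing import Callable, Dict, List, Optional, Tuple

# A stage takes a Korean name and returns a Chinese translation or None.
Stage = Callable[[str], Optional[str]]


def make_table_stage(table: Dict[str, str]) -> Stage:
    """Build a stage backed by an in-memory table (hypothetical stand-in
    for a real resource such as Wikipedia inter-language links)."""
    def lookup(name: str) -> Optional[str]:
        return table.get(name)
    return lookup


# Toy data for illustration only; not taken from the paper.
wikipedia_stage = make_table_stage({"반기문": "潘基文"})
naver_stage = make_table_stage({"이명박": "李明博"})
web_snippet_stage = make_table_stage({"부시": "布希"})

# Assumed order: follows the "First / Second / Third" wording of the abstract.
STAGES: List[Tuple[str, Stage]] = [
    ("wikipedia", wikipedia_stage),
    ("naver", naver_stage),
    ("web_snippets", web_snippet_stage),
]


def translate_person_name(
    name: str, stages: List[Tuple[str, Stage]]
) -> Optional[Tuple[str, str]]:
    """Return (stage_label, translation) from the first stage that
    produces a result, or None if every stage fails."""
    for label, stage in stages:
        result = stage(name)
        if result:
            return label, result
    return None


if __name__ == "__main__":
    for query in ["반기문", "이명박", "부시", "없는이름"]:
        print(query, translate_person_name(query, STAGES))
```

Returning the stage label together with the translation makes it easy to see which resource answered each query, which can help when comparing how much each stage contributes.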


Similar resources

KUNLP System for NTCIR-3 English-Korean Cross-Language Information Retrieval

This paper describes the KUNLP system for the English-Korean cross-language information retrieval track in the NTCIR-3 workshop and some experiments after the workshop. A query translation method based on the bilingual dictionary and the document language corpus was used. To automatically transliterate some proper nouns such as Korean person names, Korean place names, and Korean company names, we have co...


Learning Patterns from the Web to Translate Named Entities for Cross Language Information Retrieval

Named entity (NE) translation plays an important role in many applications. In this paper, we focus on translating NEs from Korean to Chinese to improve Korean-Chinese cross-language information retrieval (KCIR). The ideographic nature of Chinese makes NE translation difficult because one syllable may map to several Chinese characters. We propose a hybrid NE translation system. First, we integr...


Proper Name Translation in Cross-Language Information Retrieval

Recently, the language barrier has become a major problem for people who search, retrieve, and understand WWW documents in different languages. This paper deals with the query translation issue in cross-language information retrieval, proper names in particular. Models for name identification, name translation and name searching are presented. The recall rates and the precision rates for the identificati...


NTCIR-4 Chinese, English, Korean Cross Language Retrieval Experiments Using PIRCS

In NTCIR-4 we participated in Korean, Chinese, English monolingual, Chinese-English, English-Korean bilingual, and Chinese-Korean cross language (using English as pivot) retrieval tasks based on our PIRCS retrieval system. The query translation approach was employed for CLIR. We combined two MT translations for Chinese-English, and two for English-Korean. For the latter, a web-based entity-orient...


NTCIR-5 Chinese, English, Korean Cross Language Retrieval Experiments using PIRCS

In NTCIR-5 our focus is to see if web-assisted query expansion is useful, and to test an English-Korean bilingual dictionary. We participated in Chinese, Japanese, Korean and English monolingual retrieval using also web expansion for Chinese and English. We also performed Chinese-English, English-Chinese, English-Korean bilingual, and Chinese-Korean pivot bilingual CLIR. The query translation ap...




Publication date: 2007